
Accessing Hadoop with Python

Ask Time:2022-01-06T17:02:09         Author:subh


I am new to the data engineering field and am currently learning about the Hadoop file system and its uses. I want to run a few Hadoop commands from a Python script so that all the HDFS commands are executed in sequence. The tasks I want to perform are:

  1. Copy a file from the local file system to HDFS.
  2. Download a file from HDFS to the local file system.
  3. Read various kinds of files stored in HDFS, such as text, Avro, CSV, and Parquet files.

I want all of these tasks to be performed from a Python script rather than by typing the respective commands in the terminal. Please let me know if a library or module exists that I can use for this.

The Hadoop version is 3.2.1 and the Python version is 3.8.
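To make the three tasks concrete, here is a minimal sketch of one possible approach: wrapping the standard `hdfs dfs` shell commands (`-put`, `-get`, `-cat`) with Python's `subprocess` module. It is only an illustration under stated assumptions, not a recommended or definitive solution. It assumes the `hdfs` binary is on the `PATH` of the machine running the script; the helper names (`put_cmd`, `get_cmd`, `cat_cmd`, `run`, `parse_csv`, `run_all`) and all file paths are hypothetical. Only text and CSV reading is shown, because binary formats such as Avro and Parquet need a dedicated parser.

```python
import csv
import io
import subprocess


def put_cmd(local_path, hdfs_path):
    """Build the command that copies a local file into HDFS."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_path]


def get_cmd(hdfs_path, local_path):
    """Build the command that downloads an HDFS file to local disk."""
    return ["hdfs", "dfs", "-get", hdfs_path, local_path]


def cat_cmd(hdfs_path):
    """Build the command that streams an HDFS file to stdout."""
    return ["hdfs", "dfs", "-cat", hdfs_path]


def run(cmd):
    """Run one command and return its stdout as bytes.

    Assumes the `hdfs` executable is on PATH. A non-zero exit code
    raises subprocess.CalledProcessError, which stops the sequence.
    """
    return subprocess.run(cmd, check=True, capture_output=True).stdout


def parse_csv(raw_bytes, encoding="utf-8"):
    """Decode bytes read from HDFS and parse them as CSV rows."""
    return list(csv.reader(io.StringIO(raw_bytes.decode(encoding))))


def run_all(local_src, hdfs_path, local_dst):
    """Run the three tasks in order and return the parsed CSV rows."""
    run(put_cmd(local_src, hdfs_path))      # 1. local -> HDFS
    run(get_cmd(hdfs_path, local_dst))      # 2. HDFS -> local
    return parse_csv(run(cat_cmd(hdfs_path)))  # 3. read a CSV file


# Hypothetical usage (paths are placeholders):
# rows = run_all("data.csv", "/tmp/example/data.csv", "data_copy.csv")
```

In this sketch, `-put` and `-get` fail if the destination already exists, so the script stops at the first failing step instead of silently continuing.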

Thank you!

Author: subh. Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/70604717/accessing-hadoop-with-python